The presence of mixed type texts in a document image is an important obstacle towards the automation of the optical character recognition procedure. Machine printed character recognition and handwritten character recognition techniques

Authors

  • Faria da Silva
  • Aura Conci
  • Angel Sanchez
Abstract

Abstract: In many documents, such as admission forms, bank cheques, memorandums, letters and application forms, machine-printed and handwritten characters are mixed. Since the algorithms for recognizing machine-printed and handwritten text differ, the two types of text must be distinguished before each is passed to its respective OCR system. This separation should improve recognition performance and overall system quality. The paper discusses some observations about the characteristics of these two types of text and groups the techniques for separating machine-printed and handwritten text into three categories, based on the feature extraction method: structural and statistical features, gradient features, and geometric features.

Similar resources

Neural Network Based Recognition System Integrating Feature Extraction and Classification for English Handwritten

Handwriting recognition has been one of the active and challenging research areas in the field of image processing and pattern recognition. Its numerous applications include reading aids for the blind, bank cheques and conversion of any handwritten document into structured text form. Neural Network (NN) with its inherent learning ability offers promising solutions for handwritten characte...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Distinction between Machine Printed Text and Handwritten Text in a Document

In many documents machine-printed and handwritten texts are intermixed. Optical Character Recognition (OCR) techniques are different for machine-printed and handwritten text, so it is necessary to separate these texts before giving them as input to the OCR. In this paper we propose a methodology for the Hindi language. This methodology is based on structural features of text. Experimental results on a data...

Zone Based Features for Handwritten and Printed Mixed Kannada Digits Recognition

In the field of Optical Character Recognition (OCR), zoning is used to extract topological information from patterns. In this paper we propose zone-based features for recognition of a mixture of handwritten and printed Kannada digits. A digit image is divided into 64 zones and pixel density is computed for each zone. This procedure is repeated sequentially for every zone. Finally 64 features a...
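
The zoning step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8 x 8 grid layout (giving 64 zones), the binary input format (1 = ink) and the function name `zone_densities` are assumptions made for illustration only.

```python
def zone_densities(image, grid=8):
    """Return grid*grid foreground-pixel densities for a binary image.

    `image` is a list of equal-length rows of 0/1 values (1 = ink).
    The 8 x 8 grid is an assumed layout for the 64 zones.
    """
    rows, cols = len(image), len(image[0])
    if rows < grid or cols < grid:
        raise ValueError("image must be at least grid x grid pixels")
    features = []
    for zr in range(grid):
        # Row boundaries of this zone (integer split of the image height)
        r0, r1 = zr * rows // grid, (zr + 1) * rows // grid
        for zc in range(grid):
            # Column boundaries of this zone
            c0, c1 = zc * cols // grid, (zc + 1) * cols // grid
            ink = sum(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            # Density = ink pixels divided by zone area
            features.append(ink / ((r1 - r0) * (c1 - c0)))
    return features


# Hypothetical 16 x 16 test image whose left half is ink
image = [[1] * 8 + [0] * 8 for _ in range(16)]
features = zone_densities(image)
```

The resulting 64-element vector would then be passed to whatever classifier is used; the abstract is truncated before it names one.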

Machine-printed and hand-written text lines identification

There are many types of documents where machine-printed and handwritten texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and handwritten texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed a...

Publication date: 2013